
    Scale Invariant Interest Points with Shearlets

    Shearlets are a relatively new directional multi-scale framework for signal analysis, which has been shown to be effective in enhancing signal discontinuities such as edges and corners at multiple scales. In this work we address the problem of detecting and describing blob-like features in the shearlet framework. We derive a measure which is very effective for blob detection and closely related to the Laplacian of Gaussian. We demonstrate that the measure satisfies the perfect scale invariance property in the continuous case. In the discrete setting, we derive algorithms for blob detection and keypoint description. Finally, we provide qualitative justifications of our findings as well as a quantitative evaluation on benchmark data. We also report experimental evidence that our method is well suited to dealing with compressed and noisy images, thanks to the sparsity property of shearlets.
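    The measure described above is reported to be closely related to the (scale-normalized) Laplacian of Gaussian. As a point of reference only, the following minimal Python sketch shows that classical LoG blob detector, not the shearlet-based measure itself; the function name, scales, and threshold are illustrative.

    ```python
    # Classical scale-normalized Laplacian-of-Gaussian blob detection (reference
    # baseline only; this is NOT the shearlet-based measure of the paper).
    import numpy as np
    from scipy.ndimage import gaussian_laplace, maximum_filter

    def log_blob_detect(image, sigmas=(1.6, 2.3, 3.2, 4.5, 6.4), thresh=0.05):
        """Return (row, col, sigma) triples at scale-space extrema of |sigma^2 * LoG|."""
        image = image.astype(float)
        # Multiplying by sigma^2 normalizes the response across scales,
        # which is what gives the detector its (approximate) scale invariance.
        stack = np.stack([s**2 * np.abs(gaussian_laplace(image, s)) for s in sigmas])
        # Keep points that are local maxima jointly over space and scale.
        peaks = (stack == maximum_filter(stack, size=3)) & (stack > thresh)
        return [(r, c, sigmas[k]) for k, r, c in np.argwhere(peaks)]
    ```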

    Shortcuts for causal discovery of nonlinear models by score matching

    The use of simulated data in the field of causal discovery is ubiquitous due to the scarcity of annotated real data. Recently, Reisach et al. (2021) highlighted the emergence of patterns in simulated linear data, which displays increasing marginal variance in the causal direction. As an ablation in their experiments, Montagna et al. (2023) found that similar patterns may emerge in nonlinear models for the variance of the score vector $\nabla \log p_{\mathbf{X}}$, and introduced the ScoreSort algorithm. In this work, we formally define and characterize this score-sortability pattern of nonlinear additive noise models. We find that it defines a class of identifiable (bivariate) causal models overlapping with nonlinear additive noise models. We theoretically demonstrate the advantages of ScoreSort in terms of statistical efficiency compared to prior state-of-the-art score matching-based methods, and empirically show the score-sortability of the most common synthetic benchmarks in the literature. Our findings highlight (1) the lack of diversity in the data as an important limitation in the evaluation of nonlinear causal discovery approaches, (2) the importance of thoroughly testing different settings within a problem class, and (3) the importance of analyzing statistical properties in causal discovery, where research is often limited to defining identifiability conditions of the model.
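    For a concrete picture of the kind of sortability artifact discussed above, the toy Python sketch below reproduces the var-sortability pattern reported by Reisach et al. (2021) for simulated linear data: marginal variance grows along the causal order, so sorting variables by variance recovers it. The chain structure and edge weights are illustrative, and this is only the simpler variance-based analogue of score-sortability, not the ScoreSort algorithm.

    ```python
    # Toy illustration of var-sortability in a simulated linear SEM (chain X1 -> X2 -> X3).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    x1 = rng.normal(size=n)                    # exogenous root
    x2 = 1.5 * x1 + rng.normal(size=n)         # child of x1
    x3 = 1.5 * x2 + rng.normal(size=n)         # child of x2

    variances = [x.var() for x in (x1, x2, x3)]
    print("marginal variances:", np.round(variances, 2))   # roughly 1.0, 3.25, 8.31
    # Sorting by marginal variance recovers the causal order on this benchmark-like
    # data, which is exactly the evaluation pitfall the paper cautions against.
    print("variance-sorted order:", np.argsort(variances))  # [0 1 2]
    ```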

    Human motion understanding for selecting action timing in collaborative human-robot interaction

    In the industry of the future, as in healthcare and at home, robots will be a familiar presence. Since they will be working closely with human operators who are not always properly trained for human-machine interaction tasks, robots will need the ability to automatically adapt to changes in the task to be performed, or to cope with variations in how the human partner completes the task. The goal of this work is to make a further step toward endowing robots with such a capability. To this purpose, we focus on the identification of relevant time instants in an observed action, called dynamic instants, which are informative about the partner's movement timing and mark instants where an action starts, ends, or changes into another action. These time instants are temporal locations where the motion can be ideally segmented, providing a set of primitives that can be used to build a temporal signature of the action and ultimately support the understanding of the dynamics and coordination in time. We validate our approach in two contexts, considering first a situation in which the human partner can perform multiple different activities, and then moving to settings where an action is already recognized and shows a certain degree of periodicity. In the two contexts we address different challenges. In the first one, working in batch on a dataset collecting videos of a variety of cooking activities, we investigate whether the action signature we compute could facilitate the understanding of which type of action is occurring in front of the observer, with tolerance to viewpoint changes. In the second context, we evaluate online, on the iCub robot, the capability of the action signature to provide hints for establishing an actual temporal coordination during the interaction with human participants. In both cases, we show promising results that speak in favour of the potential of our approach.
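    As a toy illustration of the notion of dynamic instants, the sketch below marks candidate instants as local minima of speed along a tracked 2D trajectory. Treating velocity minima as segmentation points is a simplifying assumption made here for illustration; the trajectory format, frame rate, and spacing parameter are hypothetical and not those of the paper.

    ```python
    # Candidate "dynamic instants" as local minima of speed along a 2D trajectory.
    import numpy as np
    from scipy.signal import find_peaks

    def dynamic_instants(traj, fps=30.0, min_gap=0.3):
        """traj: (T, 2) array of tracked positions; returns frame indices of candidates."""
        vel = np.gradient(traj, axis=0) * fps           # frame-wise velocity
        speed = np.linalg.norm(vel, axis=1)
        # Local minima of speed = peaks of the negated profile, at least min_gap s apart.
        minima, _ = find_peaks(-speed, distance=max(1, int(min_gap * fps)))
        return minima
    ```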

    Modeling Visual Features to Recognize Biological Motion: A Developmental Approach

    In this work we deal with the problem of designing and developing computational vision models, comparable to the early stages of human development, using coarse low-level information. More specifically, we consider a binary classification setting to characterize biological movements with respect to non-biological dynamic events. To this purpose, our model builds on top of optical flow estimation and abstracts the representation to simulate the limited amount of visual information available at birth. We take inspiration from known biological motion regularities explained by the Two-Thirds Power Law, and design a motion representation that includes different low-level features, which can be interpreted as the computational counterpart of the elements involved in the law. Our reference application is human-machine interaction, thus the experimental analysis is conducted on a set of videos depicting two different subjects performing a repertoire of dynamic gestures typical of such a setting (e.g. lifting an object, pointing, ...). Two slightly different viewpoints are considered. The contribution of our work is twofold. First, we show that the effects of the Two-Thirds Power Law can be appreciated in a video analysis setting. Second, we show that, despite the coarse motion representation, our model allows us to reach biological motion classification performance (around 89%) which is reminiscent of the abilities of very young babies. Moreover, our model shows tolerance to viewpoint changes.
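    For reference, the Two-Thirds Power Law states that for curved human movements the tangential velocity V and the curvature C approximately satisfy V = K * C^(-1/3), i.e. a slope of about -1/3 in log-log coordinates. The sketch below estimates that exponent from a 2D trajectory and checks it on an elliptic trajectory, where the law holds exactly; it is an illustrative check, not the motion representation used in the paper.

    ```python
    # Estimate the velocity-curvature power-law exponent of a 2D trajectory.
    import numpy as np

    def power_law_exponent(x, y, dt):
        vx, vy = np.gradient(x, dt), np.gradient(y, dt)
        ax, ay = np.gradient(vx, dt), np.gradient(vy, dt)
        speed = np.hypot(vx, vy)
        curvature = np.abs(vx * ay - vy * ax) / np.clip(speed, 1e-9, None) ** 3
        mask = (speed > 1e-6) & (curvature > 1e-6)
        # Fit log V = log K + beta * log C; beta close to -1/3 supports the law.
        beta, _ = np.polyfit(np.log(curvature[mask]), np.log(speed[mask]), 1)
        return beta

    # Harmonic motion along an ellipse obeys the law exactly, so beta should be ~ -1/3.
    t = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
    print(power_law_exponent(3 * np.cos(t), np.sin(t), dt=t[1] - t[0]))
    ```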

    Robots with Different Embodiments Can Express and Influence Carefulness in Object Manipulation

    Humans have an extraordinary ability to communicate and read the properties of objects by simply watching them being carried by someone else. This level of communicative skill and interpretation, available to humans, is essential for collaborative robots if they are to interact naturally and effectively. For example, suppose a robot is handing over a fragile object. In that case, the human who receives it should be informed of its fragility in advance, through an immediate and implicit message, i.e., by the direct modulation of the robot's action. This work investigates the perception of object manipulations performed with a communicative intent by two robots with different embodiments (an iCub humanoid robot and a Baxter robot). We designed the robots' movements to communicate carefulness or not during the transportation of objects. We found that not only is this feature correctly perceived by human observers, but it can also elicit a form of motor adaptation in subsequent human object manipulations. In addition, we gain an insight into which motion features may induce a person to manipulate an object more or less carefully. Comment: Accepted for publication in the Proceedings of the IEEE International Conference on Development and Learning (ICDL) 2022.
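    As a purely hypothetical illustration of conveying carefulness through movement modulation, the sketch below executes the same minimum-jerk transport profile over a longer duration, which lowers its peak velocity; the movements actually designed for the iCub and Baxter robots in this study are not reproduced here.

    ```python
    # Stretching a minimum-jerk point-to-point profile in time lowers peak velocity,
    # one plausible (hypothetical) way to make a transport movement look careful.
    import numpy as np

    def min_jerk(p0, p1, duration, fps=100):
        """Minimum-jerk position profile from p0 to p1 over `duration` seconds."""
        t = np.linspace(0.0, duration, int(duration * fps))
        s = t / duration
        shape = 10 * s**3 - 15 * s**4 + 6 * s**5      # classic minimum-jerk time scaling
        return t, p0 + (p1 - p0) * shape

    for label, duration in [("not careful", 1.0), ("careful", 2.5)]:
        t, pos = min_jerk(0.0, 0.4, duration)          # 0.4 m transport
        peak_v = np.max(np.gradient(pos, t))
        print(f"{label}: duration {duration:.1f} s, peak velocity {peak_v:.2f} m/s")
    ```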

    Detecting Biological Motion for Human-Robot Interaction: A Link between Perception and Action

    One of the fundamental skills supporting safe and comfortable interaction between humans is their capability to intuitively understand each other's actions and intentions. At the basis of this ability is a special-purpose visual processing that the human brain has developed to comprehend human motion. Among the first "building blocks" enabling the bootstrapping of such visual processing is the ability to detect movements performed by biological agents in the scene, a skill mastered by human babies in the first days of their life. In this paper, we present a computational model based on the assumption that such visual ability must rely on local low-level visual motion features, which are independent of shape, such as the configuration of the body, and of perspective. Moreover, we implement it on the humanoid robot iCub, embedding it into a software architecture that also leverages the regularities of biological motion to control the robot's attention and oculomotor behaviors. In essence, we put forth a model in which the regularities of biological motion link perception and action, enabling a robotic agent to follow a human-inspired sensory-motor behavior. We posit that this choice facilitates mutual understanding and goal prediction during collaboration, increasing the pleasantness and safety of the interaction.

    A prototype application for long-time behavior modeling and abnormal events detection

    In this work we present a prototype application for modelling common behaviours from long-time observations of a scene. The core of the system is based on the method proposed in (Noceti and Odone, 2012), an adaptive technique for profiling patterns of activity in temporal data, coupling a string-based representation with an unsupervised learning strategy, and for detecting anomalies, i.e., dynamic events that diverge from the usual dynamics. We propose an engineered framework where the method is adopted to perform an online analysis over very long time intervals (weeks of activity). The behaviour models are updated to accommodate new patterns and cope with physiological scene variations. We provide a thorough experimental assessment to show the robustness of the application in capturing the evolution of the scene dynamics.
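    To make the string-based idea concrete, here is a toy Python sketch in which observed activities are encoded as symbol strings, the usual behaviours are summarised by prototype strings, and an event is flagged as anomalous when its dissimilarity to every prototype exceeds a threshold. The prototypes, threshold, and similarity measure are placeholders; the adaptive profiling and model-updating scheme of Noceti and Odone (2012) is considerably richer.

    ```python
    # Toy string-based anomaly check: an event is anomalous if it is far from every
    # prototype of "usual" behaviour (placeholder strings and threshold).
    from difflib import SequenceMatcher

    def dissimilarity(a: str, b: str) -> float:
        """1 minus the normalized similarity between two activity strings."""
        return 1.0 - SequenceMatcher(None, a, b).ratio()

    def is_anomalous(event: str, prototypes: list[str], thresh: float = 0.5) -> bool:
        return min(dissimilarity(event, p) for p in prototypes) > thresh

    prototypes = ["aabbccdd", "aabccdde"]        # learned patterns of usual activity
    print(is_anomalous("aabbccde", prototypes))  # False: close to a prototype
    print(is_anomalous("zzxxyyww", prototypes))  # True: diverges from the usual dynamics
    ```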

    Humans in groups: the importance of contextual information for collective activities classification

    In this work we consider the problem of modeling and recognizing collective activities performed by groups of people sharing a common purpose. To this aim we take into account the social contextual information of each person, in terms of the relative orientation and spatial distribution of people groups. We propose a method able to process a video stream and, at each time instant, associate a collective activity with each individual in the scene, by representing the individual, or target, as a part of a group of nearby people, the target group. To generalize with respect to the viewpoint, we associate each target with a reference frame based on his or her spatial orientation, which we estimate automatically by semi-supervised learning. Then, we model the social context of a target by organizing a set of instantaneous descriptors, capturing the essence of mutual positions and orientations within the target group, in a graph structure. Classification of collective activities is achieved with a multi-class SVM endowed with a novel kernel function for graphs. We report an extensive experimental analysis on benchmark datasets that validates the proposed solution and shows significant improvements with respect to state-of-the-art results.
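    The final classification step can be pictured with scikit-learn's precomputed-kernel interface: a kernel matrix between graphs is computed separately and passed to a multi-class SVM. In the sketch below the kernel is a simple placeholder comparing node-degree histograms, and the graphs and labels are synthetic; the kernel actually proposed in the paper, defined on the social-context graphs described above, is different.

    ```python
    # Multi-class SVM on graphs via a precomputed (placeholder) graph kernel.
    import numpy as np
    import networkx as nx
    from sklearn.svm import SVC

    def degree_histogram_kernel(graphs_a, graphs_b, bins=10):
        def feat(g):
            h, _ = np.histogram([d for _, d in g.degree()], bins=bins, range=(0, bins))
            return h / max(1, g.number_of_nodes())
        A = np.array([feat(g) for g in graphs_a])
        B = np.array([feat(g) for g in graphs_b])
        return A @ B.T                               # linear kernel on histogram features

    # Synthetic stand-ins for graphs built from target groups and their activity labels.
    rng = np.random.default_rng(0)
    train_graphs = [nx.gnp_random_graph(6, 0.5, seed=i) for i in range(20)]
    test_graphs = [nx.gnp_random_graph(6, 0.5, seed=100 + i) for i in range(5)]
    y_train = rng.integers(0, 3, size=20)            # three dummy activity classes

    clf = SVC(kernel="precomputed")
    clf.fit(degree_histogram_kernel(train_graphs, train_graphs), y_train)
    print(clf.predict(degree_histogram_kernel(test_graphs, train_graphs)))
    ```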